New Parallel Algorithms for Frequent Itemset Mining in Very Large Databases

Authors

Adriano Veloso

Wagner Meira

Srinivasan Parthasarathy

Abstract

Frequent itemset mining is a classic problem in data mining. It is an unsupervised process that finds frequent patterns (or itemsets) hidden in large volumes of data in order to produce compact summaries or models of the database. These models are typically used to generate association rules, but recently they have also been used in far-reaching domains such as e-commerce and bioinformatics. Because databases are growing in both dimension (number of attributes) and size (number of records), one of the main requirements of a frequent itemset mining algorithm is the ability to analyze very large databases. Sequential algorithms cannot meet this requirement, particularly with respect to run-time performance. Therefore, we must rely on high-performance parallel and distributed computing. We present new parallel algorithms for frequent itemset mining. Their efficiency is demonstrated through a series of experiments on different parallel environments that range from shared-memory multiprocessor machines to a set of SMP clusters connected through a high-speed network.
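The abstract defines the problem but does not describe the algorithms themselves. As a minimal sequential sketch of what "finding frequent itemsets" means, the following assumes a level-wise (Apriori-style) search over a small in-memory list of transactions. The function name `frequent_itemsets` and the parameter `min_support` (an absolute transaction count) are illustrative assumptions, and this is not claimed to be the authors' method.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Illustrative level-wise search; min_support is an absolute count (assumption).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets,
        # keeping only candidates whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting: one pass over the database per level.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1

    return result
```

The support-counting pass scans every transaction at every level, which is why run time grows quickly with database size. It is also the step that data-parallel variants commonly split across partitions of the database; how the algorithms in this paper divide the work is not stated in the abstract.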

Similar resources

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...

A Survey on Mining Algorithms

Data mining is a process that discovers knowledge or hidden patterns in large databases. In a large database, association rules are used to find meaningful relationships among large numbers of itemsets, and from these itemsets the frequent itemsets are created. Association rule mining is the most important application in large databases. Most association rule mining algorithms are improved...

Parallelizing Frequent Itemset Mining with FP-Trees

A new scheme to parallelize frequent itemset mining algorithms is proposed. By using the extended conditional databases and k-prefix search space partitioning, our new scheme can create more parallel tasks with better balanced execution times. An implementation of the new scheme with FP-trees is presented. The results of the experimental evaluation showing the increased speedup are presented.

High Utility Itemset Mining

Data mining can be defined as an activity that extracts new, nontrivial information contained in large databases. Traditional data mining techniques have focused largely on detecting statistical correlations between the items that are more frequent in transaction databases. Also termed frequent itemset mining, these techniques were based on the rationale that itemsets which appe...

AMKIS: An Algorithm for Association Mining

Mining frequent items and itemsets is a daunting task in large databases and has attracted research attention in recent years. Generating a specific itemset, a K-itemset having K items, is an interesting research problem in data mining and knowledge discovery. In this paper, we propose an algorithm, named AMKIS, for generating frequent K-itemset patterns in large databases. AMKIS ...

Publication date: 2003
